OB-Fold Recognition Combining Sequence and Structural Motifs

Authors

  • Martin Macko
  • Martin Králik
  • Brona Brejová
  • Tomás Vinar
Abstract

Remote protein homology detection is an important step towards understanding protein function in living organisms. The problem is notoriously difficult; distant homologs can often be detected only by a combination of sequence and structural features. We propose a new framework in which the user describes important sequence and structural features in the form of a descriptor, and the descriptor is then used to search a database of protein sequences and score potential candidates. We develop the algorithms necessary to support such a search using support vector machines and discrete optimization methods. We demonstrate our approach on the example of the telomere-binding OB-fold domain, showing that we can not only distinguish between Telo_bind family members and negatives, but also identify proteins from related protein families carrying similar OB-fold domains. A prototype implementation of the descriptor search software is available for the Linux operating system at http://compbio.fmph.uniba.sk/descal/


Similar resources

Relative Stabilities of Conserved and Non-Conserved Structures in the OB-Fold Superfamily

The OB-fold is a diverse structure superfamily based on a beta-barrel motif that is often supplemented with additional non-conserved secondary structures. Previous deletion mutagenesis and NMR hydrogen exchange studies of three OB-fold proteins showed that the structural stabilities of sites within the conserved beta-barrels were larger than those of sites in non-conserved segments. In this work we exam...


MegaMotifBase: a database of structural motifs in protein families and superfamilies

Structural motifs are important for the integrity of a protein fold and can be employed to design and rationalize protein engineering and folding experiments. Such conserved segments represent the conserved core of a family or superfamily and can be crucial for the recognition of potential new members in sequence and structure databases. We present a database, MegaMotifBase, that compiles a set...


Common fold in helix-hairpin-helix proteins.

Helix-hairpin-helix (HhH) is a widespread motif involved in non-sequence-specific DNA binding. The majority of HhH motifs function as DNA-binding modules; however, some of them are used to mediate protein-protein interactions or have acquired enzymatic activity by incorporating catalytic residues (DNA glycosylases). From sequence and structural analysis of HhH-containing proteins we conclude th...


Common recognition principles across diverse sequence and structural families of sialic acid binding proteins.

Sialic acids form a large family of 9-carbon monosaccharides and are integral components of glycoconjugates. They are known to bind to a wide range of receptors belonging to diverse sequence families and fold classes and are key mediators in a plethora of cellular processes. Thus, it is of great interest to understand the features that give rise to such a recognition capability. Structural anal...


iMOTdb—a comprehensive collection of spatially interacting motifs in proteins

Realization of conserved residues that represent a protein family is crucial for clearer understanding of biological function as well as for the better recognition of additional members in sequence databases. Functionally important residues are recognized well due to their high degree of conservation in closely related sequences and are annotated in functional motif databases. Structural motifs...




Publication date: 2016